(54) Content addressable memory system

(57) A system includes a plurality of content 
addressable memory (CAM) chips which are cascaded 
and connected to a common bus. Each of the CAM 
chips provides search results (hit, match address and 
multiple match). A hit signal and a multiple match signal 
are propagated from chip to chip. A system hit result is
given from the furthest downstream CAM chip. The
match address result of the system is given from the 
common bus, where on-chip self-timed signals guaran- 
tee that there is no driving contention on the bus. 
Description
TECHNICAL FIELD 

[0001] The present invention relates to a content addressable memory (CAM) system in which a plurality of CAM 
chips are cascade-connected. 

BACKGROUND INFORMATION 

[0002] In known CAMs, data is selected based on contents, rather than physical location. This function is useful for 
many applications, especially when performing a look-up for the purposes of mapping from a long identification word to 
a shorter word. This operation is required in many telecom functions, including Asynchronous Transfer Mode address 
translation. 

[0003] Often, system storage requirements exceed the number of entries stored on a single CAM chip. Multiple chips 
are then required, and it is necessary that a means be developed to cascade these multiple chips such that they may 
be searched as a single entity. An appropriate "user-friendly" cascading capability enables the same chip to be used in 
a range of systems with different capacity requirements, and allows for easy expandability and scalability, as well. 
[0004] United States Patent No. 5,568,416 granted to K. Kawana et al. on October 22, 1996 discloses an associative
memory in which multiple CAM chips are cascaded by propagating result address and status through all chips in the 
cascade. Each chip contains a status register for itself, and another for all upstream chips. It also discloses means of 
identifying the last device in the cascade, and separate storage areas for common and unique data entries. 

SUMMARY OF THE INVENTION 

[0005] It is an object of the present invention to provide an improved content addressable memory system. 
[0006] According to one aspect of the present invention, there is provided a system comprising: a common bus; and 
a plurality of content addressable memory (CAM) chips which are cascaded and connected to the bus, each of the CAM
chips comprising an array of core cells of w words x b bits and encoding means, the core cell comprising data storage
means, the CAM chip being able to provide through the encoding means hit and match address signals resulting from
a search operation, each of the CAM chips comprising: means for propagating the hit signal from chip to chip; and means
for providing the address signal to the bus. 

[0007] Preferably the CAM chip further comprises driving means for deciding which CAM chip is allowed to provide 
the address signal to the bus. Preferably the driving means comprises means for preventing more than one CAM chip 
from providing the address signal to the bus simultaneously. Alternatively, the chip comprises means for causing the 
driving means to be disabled based on signals generated on-chip and enabled by a signal propagating from upstream 
in the cascade. 

[0008] Preferably the system comprises means for observing the search results for the cascaded CAM chips from the 
hit signal of the furthest-downstream chip and the match address signal on the bus. 

[0009] Preferably the system comprises means for determining the priority of chips in the cascaded CAM chips by 
position, such that the further upstream a chip is, the higher the priority. 

[0010] Preferably each of the CAM chips further provides multiple match indication on search operation through the
encoding means and further comprises means for propagating the multiple match indication from chip to chip. Prefera- 
bly the system comprises means for observing a multiple match status of the cascaded CAM chips at a multiple match 
output of the furthest-downstream chip. 

[0011] Preferably the CAM chip further comprises means for providing the match address signal to the bus in a
sequence of subsequent cycles in a case of more than one match address. 

[0012] Preferably the system comprises means for encoding the ordinal location of the highest-priority chip with a 
match from among the cascaded multiple CAM chips. Preferably the system comprises means for providing the match
address signals, encoded ordinal location, and/or associated data. 

[0013] Preferably the CAM chip further comprises self-timing means for generating a self-timed signal for the chip's
operation, in response to a clock signal.

[0014] Preferably the self-timing means comprises means for making a first transition of the self-timed signal in 
response to the clock signal. 

[0015] Preferably the self-timing means comprises means for making a second transition of the self-timed signal at
such a time that search results on-chip are guaranteed to have been generated prior to the second transition. 
[0016] Preferably the self-timing means comprises a time delay chain, with a first edge of the self-timed signal initiated
by the clock signal, and a second edge initiated by a delayed clock signal.

[0017] Preferably the self-timing means comprises a model match line in a NOR-type CAM array, which runs over a
model row composed of modified core cells, such that only the farthest bit from the encoding means is a mismatch, all
others match, and the slowest possible match line transition is generated.

[0018] Preferably the self-timing means comprises a model match line in a NOR-type model row, in one of a plurality
of word slices, modelling delay through the encoding means, and initiating the rising edge of the self-timed signal, while
the falling edge is initiated by a model global data line, which also indirectly initiates transitions on search result signals.
[0019] Preferably the self-timing means comprises at least one model match line chain in a NAND-type CAM array,
running through a model row, comprising modified core cells which always match, generating the slowest possible 
match line transition, the transition being a match. 

[0020] Preferably the self-timing means comprises a model match line in a NAND-type model row, in one of a plurality 
of word slices, modelling delay through the encoding means, and initiating the rising edge of the self-timed signal, while
the falling edge is initiated by a model global data line, which indirectly initiates transitions on search result signals. 
[0021] Preferably the CAM chip further comprises tri-state drivers onto the common match address bus, with their
enables logically controlled by a three-input AND gate, having as inputs: the self-timed signal, the inverted hit-propaga- 
tion input signal and the on-chip hit result. 
[0022] Preferably the CAM chip further comprises means for determining the hit output of each CAM chip as the
logic OR of three signals:

the inverted self-timed signal;
the hit input signal; and
the on-chip hit result.

[0023] Preferably the system further comprises a four-input OR gate, wherein the self-timed signal is optionally
decomposed into two inputs of the four-input OR gate, with one input being a delayed self-timed signal so as to widen
the self-timed signal pulse to allow for the delay due to the combination of search results from among multiple on-chip
CAM arrays.

[0024] Preferably the CAM chip comprises a plurality of CAM arrays and the self-timed signal is the logic OR of:

a fast self-timed signal generated directly by one of the CAM arrays; and
a slow self-timed signal generated by logic circuitry which models the delay experienced by a hit signal logic circuit.

[0025] In the system, a system hit signal resulting from a search operation of the CAM chip is propagated from chip to
chip. A system hit result is given from the furthest downstream CAM chip and a system match address result is given from the
common bus. The system functions as a single multi-chip CAM with n x w words, n being the number of the CAM chips.
[0026] Each CAM chip may further comprise driving means for deciding the CAM chip allowed to provide the match
address signal to the bus. The driving means comprises means for preventing more than one CAM chip from providing
the address signal to the bus simultaneously.

BRIEF DESCRIPTION OF THE DRAWINGS 

[0027] An embodiment of the present invention will now be described by way of example with reference to the accompanying
drawings in which:

Figure 1A illustrates a conceptual view of a single CAM array and its output signals;
Figure 1B illustrates the implied location of the CAM array within a single CAM chip;
Figure 1C illustrates the simplest possible connection of the CAM array to chip pins;

Figure 2 is a block diagram of a system including a plurality of CAM chips which are cascaded, according to an 
embodiment of the present invention; 

Figure 3 illustrates the system with circuitry for determining a bus driving CAM chip; 
Figure 4 is a timing chart showing self-timed signal and on-chip search results; 
Figure 5 illustrates logic gates for implementing cascading of the CAM array;

Figure 6A is a timing chart showing relative timing of on-chip signals responsive to the transition from 0 to 0 of a 
propagation-in hit signal; 

Figure 6B is a timing chart showing relative timing of on-chip signals responsive to the transition from 1 to 0 of the 
propagation-in hit signal;

Figure 6C is a timing chart showing relative timing of on-chip signals responsive to the transition from 0 to 1 of the
propagation-in hit signal; 

Figure 6D is a timing chart showing relative timing of on-chip signals responsive to the transition from 1 to 1 of a 
propagation-in hit signal; 
Figure 7 illustrates logic gates to implement multiple match propagation in the system;

Figure 8 is a block diagram of a self-timed signal generator;

Figure 9 is a circuit diagram of a first example of a CAM chip; 

Figure 10 is a circuit diagram of a second example of a CAM chip; 
Figure 11 is a circuit diagram of a third example of a CAM chip;

Figure 12 illustrates a self-timed signal generator; 

Figure 13 is a circuit diagram of a fourth example of a CAM chip; 

Figure 14A illustrates a self-timed signal generator; 

Figure 14B illustrates a self-timed signal generator; 
Figure 15 is a block diagram of a fifth example of a CAM chip;

Figure 16 illustrates logic gates to implement appropriate full-chip timing of a self-timed signal to multiple CAM
arrays; and

Figure 17 illustrates a logic gate to generate a propagation-out hit signal.

DETAILED DESCRIPTION

[0028] It is desirable to implement a multi-chip CAM that has the same simple three result outputs as a single-chip 
CAM (i.e., hit, match address and multiple match). It is further desirable to allow simple expandability from a one-chip 
to n-chip system, using nothing but a plurality n of instances of the same chip. 

I. Concept Of A CAM Array 

[0029] Figure 1A conceptually represents a single CAM array. A CAM array 110 has three outputs: i.e., hit ht, multiple
match mt and match address sa. Array outputs, as well as other on-chip signals, are denoted in this disclosure by lower-case
lettering. Signals which travel on and off chip (via pins) are denoted by upper-case lettering. Hereinafter, whenever
a CAM chip is shown, by implication the array is embedded on a CAM chip 120 as shown in Figure 1B. The simplest
possible connections between the array outputs and the chip outputs are shown in Figure 1C.

II. An Efficient Manner Of The Connection Of Multiple CAM Chips 

[0030] It is the intent of this invention to enable the connection of a plurality of CAM chips in an efficient manner. 
[0031] To achieve cascadability with simple expandability, the following are requirements for chip design: 

(1) The multiple chips should be able to share as many control signals and buses as possible, to avoid the need for
new board-level signals for each additional chip.

(2) In order for the combination of multiple chips to appear as a single entity, the over-all search result should be 
available at some pre-determined location in a system in which a plurality of CAM chips are cascaded. This applies 
to the encoded match address, a hit indication, and a multiple match indication, if one is provided. If this capability 
(of providing the result at a pre-determined location) is provided, it will not be required to sequentially poll the multiple
chips to determine the search result.

(3) All chips in the cascade are required to be identical, in terms of:

(a) actual physical composition; 

(b) programmed capability, specifically "priority". A chip's priority should be inherently defined by its position in 
the cascade, and should not require programming of an on-chip register.

(4) The number of signals driven from chip-to-chip in a cascaded fashion should be minimized: 

(a) The encoded match address is too wide to propagate in this fashion. 
(b) The chip-to-chip "daisy-chain" signals should ideally have some meaning to the user, in addition to their utility
in chip-to-chip signalling.

(5) It is clear from the above design requirements that all of the chips in the cascade will be capable of driving their 
individual match address results onto a single bus: 

(a) There should be no contention problems on this bus. 

(b) On-chip circuitry should determine which chip is to drive the bus; it is not required to have means for select- 
ing a chip for enabling. 
III. Maximization Of The Number Of Shared Signals

[0032] To maximize the number of shared signals, it is proposed that all chips in the multi-chip cascaded CAM use: 

(1) A common input data bus for write data.

(2) A common output data bus for read data.

(3) A common address bus for randomly addressable writes and reads. 

(4) A common set of mode control signals, to determine which operation (write, read, or search) is being performed 
on the multi-chip CAM, as a whole. 

(5) A common input bus for the search input (or "comparand").

(6) A common output bus for the search result, which is usually the encoded address of a match. Note that the
"result" could also be a data item stored together with, and associated with, the comparand. It may also be a series
of data items, either:

(a) multiple encoded addresses, in the case of a multiple match search outcome

(b) multiple items of associated data 

(c) a combination of the above 

[0033] Note that the buses (1), (2), and (5) above may easily have dual or triple uses.
[0034] The following deals with the search function of the cascaded CAM, as the sharing of buses and control signals
for memory reads and writes is well understood and documented in the literature. 

IV. Concept Of A Multi-Chip CAM System

[0035] The system hit and multiple match results are available at the downstream end (the low priority end).
[0036] The multiple match function need not necessarily be provided. 
[0037] The encoded address is available on the shared result bus. It may comprise: 

(a) The result, as determined by the particular chip enabled to drive the bus, and as described by the common output
bus description (6) above.

(b) The result plus an encoded address uniquely identifying the selected chip. 

[0038] This encoded address need not propagate through multiple chips. 

[0039] All chips are identical (aside from any optional identification encoding capability implied above), and priority is
determined by location in the cascade: the further upstream, or the further to the left, the higher the priority. A higher
priority match disables a lower priority match from driving the result bus.

[0040] The connection of hit, multiple match, propagation-in hit and propagation-in multiple match pins as shown in 
Figure 2 implements a "daisy chain". 

[0041] The observation of a given hit and multiple match pair will indicate the status of the entire system upstream (to
the left) of that particular pair.
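
The daisy-chain behaviour described above can be sketched in a few lines of Python. This is only an illustrative steady-state model, not the disclosed circuit: it assumes the self-timed signal is high on every chip so that each HT pin is simply the OR of the HTI pin and the on-chip hit, and the function names `hit_pins` and `highest_priority_hit` are hypothetical.

```python
def hit_pins(local_hits):
    """Return the HT pin level of every chip, furthest upstream first.

    local_hits[k] is the on-chip hit result of chip k. Steady state is
    assumed (self-timed signal high), so HT = HTI OR ht on each chip.
    """
    hti = False  # the furthest-upstream hit input is tied to logic 0
    pins = []
    for ht in local_hits:
        hti = hti or ht  # HT of this chip becomes HTI of the next chip
        pins.append(hti)
    return pins


def highest_priority_hit(local_hits):
    """Position of the furthest-upstream chip with a hit, or None."""
    for position, ht in enumerate(local_hits):
        if ht:
            return position
    return None
```

Observing the last element of `hit_pins(...)` corresponds to reading the system hit at the furthest-downstream chip, while any earlier element reports the status of the portion of the cascade upstream of that pin.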

V. Embodiment Of A Multi-Chip CAM System

[0042] Referring to Figure 2 which shows a system according to an embodiment of the present invention, the system
includes n CAM chips 120 which are cascaded and each of the CAM chips 120 provides match address signals SA to
a common shared bus 122. Each of the CAM chips 120 has hit and multiple match input terminals for receiving the hit
and multiple match signals (off-chip signals) HTI and MTI, respectively, from the upstream CAM chip 120 and hit and
multiple match output terminals for providing the hit and multiple match signals HT and MT, respectively, to the downstream
CAM chip 120. The hit and multiple match input terminals of the furthest upstream CAM chip 120 are connected
to logic 0 terminals. The CAM chip 120 has numerous variations which will be described later. The system hit and multiple
match results SHT and SMT are available at the far right side (the furthest downstream CAM chip 120). A clock
generator 124 provides clock signals ck to the CAM chips 120. A search result observation circuit 126 is connected to 
the bus 122 and the hit and multiple match output terminals of the furthest downstream CAM chip 120. 
[0043] Because the hit and multiple match results HT and MT provide information on the status of all or a portion of
the system, rather than the status of a single chip, another means must be provided for determining the status of a given
chip. A useful piece of status information is the ordinal location of the chip that has driven its result onto the bus 122
(i.e., the highest-priority chip with a match). The binary representation of the ordinal of the chip may be determined, in
one possible way, by logic circuitry shown in Figure 3. This solution requires logic circuitry external to the CAM chips
forming the cascade, using the hit signals as inputs.

[0044] Figure 3 shows the cascaded CAM chips with external logic circuits including n AND gates for determining
which CAM chip is driving the common bus. Referring to Figure 3, each of the AND gates 128 has inverting and non-inverting
input terminals. The hit input and output terminals of the CAM chip 120 are connected to the inverting and non-inverting
input terminals of the respective AND gate 128. The n output signals from the AND gates 128 are fed to an n-

[0045] Alternatively, the AND gates 128 may be integrated on-chip, with an additional pin on the CAM chip 120 provided
to indicate whether the particular chip has the highest-priority hit, and is driving the bus 122. Also, search address
results may be stored on-chip in registers (not shown). The output of the encoder 130 may be used to determine which
chip's result register is read.
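
The external logic of Figure 3 can be illustrated with the following Python sketch. It is a minimal model under stated assumptions: each AND gate 128 is taken to compute HT AND (NOT HTI) for its chip, and the encoder 130 is assumed to be a plain binary encoder of a one-hot input; the function names are hypothetical and not part of the disclosure.

```python
def and_gates_128(hti_pins, ht_pins):
    """One output per chip: high only where HT is high and HTI is low."""
    return [ht and not hti for hti, ht in zip(hti_pins, ht_pins)]


def ordinal_of_driver(one_hot):
    """Assumed encoder behaviour: index of the single asserted input.

    Returns None when no input (or more than one input) is asserted.
    """
    if one_hot.count(True) != 1:
        return None
    return one_hot.index(True)
```

For a four-chip cascade in which only the second chip and the fourth chip have on-chip hits, the HT pins read low, high, high, high, and only the second AND gate output is asserted, so the encoded ordinal is 1 (binary 01).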

[0046] Above it was stated in design requirement (5) that on-chip circuitry has to ensure that there is no contention
on the common bus; i.e., more than one chip is never attempting to drive the bus 122 at one time. In order to implement
this functionality on-chip, an internal self-timed signal is introduced. This signal goes low following the rising edge of the
clock signal ck which initiates the search operation. It rises after valid data is present on the internal address bus and
on the internal hit status signal ht. Hence, transitions on the self-timed signal can be made to model those on the hit
signal ht. Timing of these transitions is shown in Figure 4.

[0047] Given the signals shown in Figure 4, system hit status is propagated and it is ensured that only a single chip
drives the shared result bus. This is accomplished with the logic shown in Figure 5 which shows how inter-chip signals
are generated.

[0048] Figure 5 shows a CAM array with logic circuits for generating inter-chip signals. Referring to Figure 5, the CAM
chip 120 contains a CAM array 110, an AND gate 132 with one inverting input, a transfer gate 134, an OR gate 136 with
one inverting input and a buffer 138. A propagation-in hit signal hti from an off-chip signal HTI is fed to the inverting input
terminal of the AND gate 132 and the OR gate 136. An internal self-timed signal st is provided to the AND gate 132 and
the inverting input terminal of the OR gate 136. A hit signal ht from the CAM array 110 is fed to the AND gate 132 and
the OR gate 136. A match address enable signal sae is fed from the AND gate 132 to the transfer gate 134. An address
signal sa from the CAM array 110 is fed to the transfer gate 134, which prevents the address signal from passing
through the gate when the enable signal sae is low. An off-chip address signal SA is provided by the transfer gate 134.
A propagation-out hit signal hto from the OR gate 136 is fed to the buffer 138 which in turn provides an off-chip hit signal
HT.
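
The gate-level behaviour of Figure 5 can be written out as Boolean expressions. The Python sketch below is a purely combinational model that ignores all propagation delays; the function names `chip_outputs` and `bus_enables` are hypothetical, and the cascade helper assumes each chip's HT pin feeds the next chip's HTI pin directly.

```python
def chip_outputs(st, hti, ht):
    """Model of AND gate 132 and OR gate 136 for one chip.

    st  : internal self-timed signal
    hti : propagation-in hit signal (on-chip copy of HTI)
    ht  : on-chip hit result from the CAM array
    """
    sae = st and (not hti) and ht  # AND gate 132, hti on the inverting input
    hto = (not st) or hti or ht    # OR gate 136, st on the inverting input
    return sae, hto


def bus_enables(st_list, ht_list):
    """Match address enables for a cascade, furthest upstream first."""
    hti = False  # furthest-upstream hit input tied to logic 0
    enables = []
    for st, ht in zip(st_list, ht_list):
        sae, hto = chip_outputs(st, hti, ht)
        enables.append(sae)
        hti = hto  # HT of this chip drives HTI of the next chip
    return enables
```

In this model at most one enable is ever asserted, whatever the combination of st and ht values: once any chip reports a hit, or has st low, every downstream chip sees a high hti and is held off the bus.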

[0049] During the time interval when st=0 on all chips in the system, no chips are enabled to drive the bus 122. During
the same interval, all hit signals HT in the cascade are at logic 1, due to st=0, doubly disabling SA output drivers through
the propagation-in hit signal hti. This partial redundancy may be removed by re-timing the signals and decreasing the
number of inputs to the gates. Note that such an approach would lead to a less robust design.
[0050] Waveforms of all of the relevant signals on a single chip are shown in Figures 6A - 6D, for the four different
cases of the propagation-in hit signal hti (transition from 0 to 0, 1 to 0, 0 to 1, and 1 to 1). Note that the propagation-in
hit signal hti is the on-chip propagation of the off-chip signal HTI.
[0051] As can be seen, correct operation is independent of (a) speed differences between chips and (b) inter-chip
routing delay, because de-selection occurs on-chip, and only selection is gated by upstream off-chip signals. This feature
also supports expandability, as additional chips added to a system may be subject to different processing conditions,
or even a completely different fabrication technology.
[0052] When worst-case timing is characterized, the slowest path to selection will be from the propagation-in hit signal
HTI input. The downward transition on the propagation-in hit signal HTI may further propagate to the propagation-out
hit signal HT (assuming ht=0), such that the worst-case system performance is equal to that of a single chip standing
alone, plus (n-2) times the propagation-in hit signal HTI-to-HT delay plus the propagation-in hit signal HTI-to-SA delay.
System performance can be characterized by the following expressions:

tCH-SAV = tCH-HTV + (n-2) x tHTIL-HTL + tHTIL-SAV

tCH-SAVSYS = tCH-HITV + (n-2) x tHITIL-HITL + tHITIL-SAV

tCH-SHTV = tCH-HTV + (n-1) x tHTIL-HTL

tCH-HITVSYS = tCH-HITV + (n-1) x tHITIL-HITL
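
As a worked illustration of the expressions above, the following Python sketch evaluates the worst-case system delays. The parameter names mirror the subscripts in the expressions, the cascade is assumed to have at least two chips, and the numeric values in the usage note are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def system_match_address_delay(n, t_ch_htv, t_htil_htl, t_htil_sav):
    """Clock-high to system match address valid, for n >= 2 chips."""
    return t_ch_htv + (n - 2) * t_htil_htl + t_htil_sav


def system_hit_delay(n, t_ch_htv, t_htil_htl):
    """Clock-high to system hit valid at the furthest-downstream chip."""
    return t_ch_htv + (n - 1) * t_htil_htl
```

With hypothetical figures of 10 time units for the single-chip hit delay, 2 units for the HTI-to-HT delay and 3 units for the HTI-to-SA delay, a four-chip cascade gives 10 + 2 x 2 + 3 = 17 units to a valid match address and 10 + 3 x 2 = 16 units to a valid system hit.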

[0053] Note that, without the self-timed signal st, disabling and enabling SA drive would be dependent on HTI timing.
Bus contention would be difficult to prevent, and disable timing would depend on a chip's position in the cascade.
[0054] The multiple match signal MT, if present, must propagate in a similar way, with logic on each chip, in one example
shown in Figure 7 which illustrates logic gates to implement MT propagation in a cascaded CAM. Referring to Figure
7, the hit signal ht and the multiple match signal mt from the CAM array 110 are fed to an AND gate 142 and an OR
gate 144, respectively. The propagation-in hit signal hti is fed to the AND gate 142, the output signal of which is fed to
the OR gate 144. The propagation-in multiple match signal mti is fed to the OR gate 144, the output signal of which is fed to a buffer
146. The off-chip multiple match signal MT is provided by the buffer 146.
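
The multiple match propagation can be sketched in Python as follows. This is an interpretation rather than a transcription of Figure 7: it assumes the OR gate 144 combines the on-chip multiple match mt, the output of the AND gate 142 (ht AND hti), and a propagation-in multiple match signal, here called mti, received from the upstream chip. The hit chain is modelled in steady state with the self-timed signal high, and all function names are hypothetical.

```python
def mt_out(ht, mt, hti, mti):
    """Assumed Figure 7 logic: OR of mt, (ht AND hti) and mti."""
    return mt or (ht and hti) or mti


def system_status(chips):
    """chips: list of (ht, mt) pairs, furthest upstream first.

    Returns (system hit, system multiple match) as seen at the
    furthest-downstream chip.
    """
    hti = mti = False  # upstream inputs of the first chip tied to logic 0
    for ht, mt in chips:
        mti = mt_out(ht, mt, hti, mti)
        hti = hti or ht  # steady-state hit propagation
    return hti, mti
```

Under these assumptions, two single hits on different chips are reported as a system-level multiple match, because the downstream chip sees both its own hit and a high propagation-in hit.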

VI. Generator Of A Self-Timed Signal

[0055] There are many possible circuits of self-timed signal generators. It is the intended scope of this invention to
subsume any such circuit, provided the resulting self-timed signal st is employed as described above to enable contention-free
result bus sharing.

[0056] In the description that follows, disclosed are examples of self-timed signal generators. These examples are
meant to provide a broad view of implementation possibilities, and their descriptions in no way limit the scope of the 
foregoing part of this patent disclosure. 

[0057] In a first example of a self-timed signal generator, shown in Figure 8, a simple delay line models the expected
delay in the generation of a hit signal. Referring to Figure 8, the clock signal ck is fed to the reset input terminal R of a
flip-flop 152. Also, the clock signal ck is fed to the set input terminal S of the flip-flop 152 through a chain of four buffers
154.

[0058] The falling edge of the self-timed signal st is generated by the rising edge of the clock signal ck, while the rising
edge of the self-timed signal st is generated by a delayed version of the rising edge of the clock signal ck. The S/R
(set/reset) latch shown in this and subsequent figures represents a logical function, and not necessarily a physical realization.
Timing both edges of the self-timed signal st from the rising edge of the clock signal ck results in duty cycle independence.
The delay of the delay chain can be set equivalent to the delay between the rising edges of the clock signal
ck and the hit signal ht. Alternatively, if the clock signal ck duty cycle is known and well controlled, timing of the rising
edge of the self-timed signal st may be controlled by the falling edge of the clock signal ck. Note that hit timing must be
predictable, in order to employ this example; it is not appropriate in a modular or scalable design, in which the hit signal
delay may vary from implementation to implementation.
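
The delay-line generator can be illustrated with a discrete-time Python sketch. This is a simplified logical model, not a circuit simulation: the clock is represented as a list of 0/1 samples, the buffer chain is reduced to an integer `delay` of at least one sample, and the latch is assumed to respond to rising edges only. The function name `self_timed_signal` is hypothetical.

```python
def self_timed_signal(ck, delay):
    """Return st for each clock sample.

    st falls on a rising edge of ck and rises again `delay` samples
    after that same rising edge (delay >= 1 is assumed).
    """
    st = 1
    out = []
    for t in range(len(ck)):
        rising = t > 0 and ck[t - 1] == 0 and ck[t] == 1
        td = t - delay  # sample index seen at the end of the delay chain
        delayed_rising = td > 0 and ck[td - 1] == 0 and ck[td] == 1
        if rising:
            st = 0  # reset: search operation has started
        elif delayed_rising:
            st = 1  # set: modelled hit delay has elapsed
        out.append(st)
    return out
```

Because both edges of st are derived from the rising edge of ck, clocks with different high times but the same rising edge produce the same st waveform in this model, which illustrates the duty cycle independence noted above.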

VII. Examples Of CAM Arrays 

VII-1. First Example

[0059] Figure 9 shows a first example of the CAM array which is implemented in the CAM chip 220. In the CAM array,
a single chip CAM of w (=4) words x b (=4) bits is implemented as an array with w rows and b columns. The CAM array
includes w x b (=16) core cells 230, each cell being at the intersection of a match line 232 and a pair of bit lines 234. A
pair of bit lines 234 carry differential data representing a single bit, rather than two bits of data. Each core cell 230 acts
to store a single bit of data and is capable of performing a single-bit comparison (logical exclusive NOR (XNOR)) operation,
in addition to its bit storage capability. In Figure 9, the cells 230 belonging to a given word are connected to the
match line of that word in a logical NOR fashion. The structure of the CAM array is known. See a paper by K.J. Schultz
et al. entitled "Architectures for Large-Capacity CAMs", INTEGRATION: the VLSI Journal, Vol. 18, pp. 151-171, 1995,
which is incorporated herein by reference.

[0060] The bit lines for differential data are connected to reference word storage and bit line drivers 236 which receive 
input data D for loading the contents of the CAM array and for the search reference word. Data stored in the array's core 
cells 230 are searched by applying a reference word on the bit lines 234. 

[0061] When differential data is asserted on a pair of bit lines 234 in a search operation, the core cell 230 compares
its stored data bit with this differential data (also known as reference data, or a single bit of the comparand). When the
stored data is not equal to the reference data, the core cell 230 pulls the match line 232 (which is precharged to a logical
high state) down to a low state. When the stored data is equal to the reference data, the cell 230 has no effect on the
match line 232 to which it is connected. Because all b core cells 230 in a given word are connected to the match line
232 in the same way, the match line 232 will be pulled low if any bit in its word is unequal to (or mismatches) the corresponding
reference bit. The match line 232 remains in a logical high state only if all bits in its word match their corresponding
reference bits.
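
The NOR-type match line behaviour can be mirrored in a short Python sketch. It is a behavioural illustration only: bits are modelled as 0/1 integers, the precharged match line as a Boolean, and the names `nor_match_line` and `search` are hypothetical.

```python
def nor_match_line(stored_word, reference_word):
    """Final level of one match line after a search."""
    line = True  # match line is precharged to a logical high state
    for stored_bit, reference_bit in zip(stored_word, reference_word):
        if stored_bit != reference_bit:
            line = False  # any mismatching cell pulls the line low
    return line


def search(words, reference_word):
    """Match line levels for every stored word, in row order."""
    return [nor_match_line(word, reference_word) for word in words]
```

For a 4 word x 4 bit array holding 1011, 0010, 1011 and 1111, a search for 1011 leaves the first and third match lines high and pulls the other two low.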

[0062] The CAM array includes an encoder 238 which is connected to the match lines 232. The encoder 238 produces
three outputs that represent the result of the search operation. The "ht" signal is asserted to a logical high state
if any of the w words is storing data which has matched the reference data. The binary address of this matching word
is encoded onto the "sa" output. In the event that a plurality of words have matched the reference data, the multiple
match signal "mt" is asserted to a logical high state. In this event, the address sa output of the encoder 238 may produce
(a) an invalid result, (b) an address representing the location of a single one of the multiple matches, or (c) a
sequence of outputs, representing the locations of each of the matched words. Note that some applications may not
require the multiple match result, and all references to the multiple match function may be eliminated from this disclosure,
without loss of utility.

EP 0 899 743 A2

VII-2. Second Example

[0063] Figure 10 shows a second example of the CAM array which is implemented in a CAM chip 320. In the CAM
array, the words are divided into two halves, and the results of the match on each half word are combined. Each of the
two halves is provided with an array of 4 rows x 4 columns. The array includes 16 core cells 330, each being at the intersection
of a match line 332 and a pair of bit lines 334 which carry differential data representing a single bit. The bit lines
334 for differential data are connected to reference word storage and bit line drivers 336 which receive input data D for
loading the contents of the CAM array and for the search reference word. Data stored in the array's core cells 330 are
searched by applying a reference word on the bit lines 334.

[0064] Each core cell 330 acts to store a single bit of data and is capable of performing a single-bit comparison (logical
exclusive NOR (XNOR)) operation, in addition to its bit storage capability. In Figure 10, the cells 330 belonging to a
given word are connected to the match line of that word in a logical NAND fashion. The core cells 330 of each word are
chained in the respective match line 332. Each of the match lines 332 of one half is connected via an inverter 338 to an 
AND gate 340, the output terminal of which is connected via a multiple match line 342 to an encoder 344.
[0065] In Figure 10, the connection (in each half word) is in a logical NAND. The match line 332 will only have a downward
transition, if all of the bits in the half word are equal to the reference data. Hence, the path to ground for the match
line 332 is serial (a "match line chain") rather than parallel, and the path is made conductive (i.e., the circuit is closed)
in the event of a match, rather than a mismatch.
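
The NAND-type match line chain with split words can be sketched in Python as follows. This is a behavioural illustration under simplifying assumptions: each half-word chain conducts only when every bit matches, a conducting chain pulls its precharged match line low, and the two inverted half-word results are combined by an AND function. The function names are hypothetical.

```python
def match_line_level(stored_half, reference_half):
    """Level of one half-word match line after a search.

    The serial chain conducts only if all bits match, which pulls the
    precharged line low; otherwise the line stays high.
    """
    chain_conducts = all(s == r for s, r in zip(stored_half, reference_half))
    return not chain_conducts


def word_matches(stored_word, reference_word):
    """Combine the two half-word results (inverters followed by an AND)."""
    mid = len(stored_word) // 2
    left = match_line_level(stored_word[:mid], reference_word[:mid])
    right = match_line_level(stored_word[mid:], reference_word[mid:])
    return (not left) and (not right)
```

In this model a match line changes state only for a matching half word, in contrast to the NOR-type arrangement in which every mismatching word causes a transition.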

[0066] The advantage of this technique is due to the much smaller number of match lines 332 subject to a transition
in each search operation: one per match in the example shown in Figure 10, compared to one per mismatch in the prior
art circuit shown in Figure 9. This reduces power dissipation considerably, allowing the realization of larger storage
capacities. The division of the word into halves decreases the length of the NAND chain, thus increasing speed.

[0067] The example of a CAM array shown in Figure 10 also includes means of placing multiple words in a physical 
row, by employing an upper metal layer above the core cell for the multiple match lines 342. This further increases the 
storage capacity that can be realized. 

[0068] The CAM array produces three outputs ht, sa and mt that represent the result of the search operation, and
these may all be generated by the encoder 344. The "ht" signal is asserted to a logical high state if any of the w words
is storing data which has matched the reference data. The binary address of this matching word is encoded onto the
"sa" output. In the event that a plurality of words have matched the reference data, the multiple match signal "mt" is
asserted to a logical high state. In this event, the address (sa) output of the encoder may produce (a) an invalid result,
(b) an address representing the location of a single one of the multiple matches, or (c) a sequence of outputs,
representing the locations of each of the matched words.
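
The three encoder outputs can be illustrated with a short behavioural sketch. It assumes option (b) above, with the lowest-numbered matching word chosen as the reported address; that choice, the function name `encode` and the use of `None` for "no valid address" are assumptions made for illustration and are not taken from the description.

```python
from typing import List, Optional, Tuple


def encode(match_flags: List[bool]) -> Tuple[int, Optional[int], int]:
    """Return (ht, sa, mt) for one search over the w words of an array.

    ht: 1 if at least one word matched.
    sa: address of the reported match (assumed here: the lowest index),
        or None when there is no match.
    mt: 1 if more than one word matched.
    """
    matching = [index for index, matched in enumerate(match_flags) if matched]
    ht = 1 if matching else 0
    mt = 1 if len(matching) > 1 else 0
    sa = matching[0] if matching else None
    return ht, sa, mt


single = encode([False, False, True, False])   # one match at address 2
multiple = encode([False, True, False, True])  # matches at addresses 1 and 3
miss = encode([False, False, False, False])    # no match
```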

[0069] Note that there are many other possible CAM arrays, and the invention herein described may be used to add 
cascadability to any of these examples. 

[0070] The basic function of a single CAM array does not change from that of the examples described above, in the
case when a plurality of arrays (on a plurality of chips) are cascaded together to realize a capacity larger than that which
may be realized by a single array on a single chip.

VII-3. Third Example 

[0071] Figure 11 shows a third example of the CAM array. It produces a "modelmiss" signal which tracks variable hit
delay, and is an extension of the first CAM array described above (and shown in Figure 9). Figure 12 shows the
accompanying self-timed signal generator.

[0072] The falling edge of the self-timed signal st is initiated by the rising edge of the clock signal ck (directly or
indirectly), and the rising edge of the self-timed signal st is initiated by the signal modelmiss. An extra row is provided in the
CAM array for the purpose of generating timing information. The core cells along this row are modifications of the
standard core cell 410 (identical to the core cell 230 in Figure 9). An always-miss core cell 420 is placed at the end of the
model match line 422 farthest from the encoder 418, while the rest of the row is populated with always-match core cells
430. The slowest possible (single-word) search result in a standard NOR match line implementation is a single-bit miss,
and it is modelled by this arrangement. The downward transition on the modelmiss signal corresponds with the instant
at which valid data is guaranteed on the hit signal ht. Note that, in order to generate the hit signal polarities shown in
previous timing diagrams, the hit signal should also initially be driven low in a preconditioning transition immediately
following the rising edge of the clock signal ck.

VII-4. Fourth Example

[0073] A fourth example includes circuitry to track hit delay in a CAM array implemented according to the second CAM
array described above (and shown in Figure 10). This is shown in Figure 13. Accompanying self-timed signal generators
are shown in Figures 14A and 14B. The falling edge of the self-timed signal st is initiated by the rising edge of the
clock signal ck (directly or indirectly), and the rising edge of the self-timed signal st is initiated by the signal modelhit.
An extra row is provided in the CAM array for the purpose of generating timing information. The core cells along this row
are modifications of the standard core cell 510 (identical to the core cell 330 in Figure 10). An always-match core cell
512 is used throughout model match line chains 514. The slowest possible (single-word) search result in this NAND
match line implementation is a match, and it is modelled by this arrangement. The upward transition on the modelhit
signal corresponds to the instant at which valid data is guaranteed on the hit signal ht. Note that, in order to generate
the hit signal polarities shown in previous timing diagrams, the hit signal should also initially be driven low in a
preconditioning transition immediately following the rising edge of the clock signal ck. Because modelhit has the same polarity
as the self-timed signal st, the self-timed signal drive circuit may be removed, and modelhit may be used as the
self-timed signal st, as shown in Figure 14B.

VII-5. Fifth Example 

[0074] Figure 15 is a block diagram of a circuit with a plurality of word slices that use the same match logic as the
second CAM array. Self-timed signal generation in this arrangement employs a model global data line.

[0075] A fifth example of the self-timed signal generator can be used to track hit delay in a CAM array with a plurality
of the vertical word slices employed in the second CAM array, as shown in Figure 15. The plurality of word slices allows
the realization of larger capacities. Because all bits of the comparand D must be bused to all word slices, a global data
bus 622 is employed. Timing information may be embedded in the global data bus 622, in the form of a model global
data line 624. The model global data line 624 is driven such that its first transition approximately coincides with the
downward transition of the hit signal ht. The model global data line 624 is also used to provide timing information to the
CAM array, guaranteeing this coincidence. The falling edge of the self-timed signal st is initiated by the first transition of
the model data line, and the rising edge of the self-timed signal st is initiated by the signal modelhit. An extra row is
provided in the word slice of the CAM array farthest from the encoder, for the purpose of generating timing information. The
core cells along this row are modifications of the standard core cell 610 (identical to the core cell 330 in Figure 10). An
always-match core cell 612 is used throughout model match line chains 614. The slowest possible (single-word) search
result in this NAND match line implementation is a match, and it is modelled by this arrangement. The upward transition
on the modelhit signal corresponds with the instant at which valid data is guaranteed on the hit signal ht. Note that, in
order to generate the hit signal polarities shown in previous timing diagrams, the hit signal ht should also initially be
driven low in a preconditioning transition following the rising edge of the clock signal ck, with its timing governed by the
model global data line. As in the previous two examples, delay of the modelhit signal through the encoder is meant to
track that of the hit signal ht through the encoder. Where the encoder is realized with random logic, this may be achieved
by a delay chain. Where the encoder is realized as a read-only memory (ROM), delay matching may be achieved with
a model ROM bit line.

[0076] Note that a NOR match line CAM array may also be implemented in a word slice fashion to achieve higher
capacities. The self-timed signal may be generated by combining the model match line from Figure 11 with the model
global data line from Figure 15.

[0077] Figure 16 illustrates logic gates to implement appropriate full-chip timing of the self-timed signal st, in the case
when a CAM chip comprises multiple CAM arrays.
[0078] When the CAM on each chip is composed of multiple arrays, each CAM array 710 has its own "htj" and "stj"
signals. It is necessary to widen the duration of the pulse of the self-timed signal st, to allow for the delay in combining
individual signals into the hit signal ht. Figure 16 shows an example of this pulse-widening, in which a single stj signal
is logically ORed by OR gate 712 with a delayed version of the same stj (referred to as "stm"), said version having
passed through a delay similar to that of the array hit signal htj to ht.
[0079] Another example, shown in Figure 17, brings both stm and stj to the propagation-out hit signal hto gate (see
Figure 5). Both examples of Figures 16 and 17 prevent a downward glitch on the propagation-out hit signal hto when
the propagation-in hit signal hti = 0 and hit makes an upward transition following stj. This same sequence of transitions
is not consequential to the sae gate of Figure 5, and no changes to it, similar to Figure 17, are required.
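
The glitch described above can be reproduced with a small discrete-time sketch. It assumes that the hit output gate is the three-input OR recited in claim 11 (inverted self-timed signal, hit input, on-chip hit result) and represents each signal as a list of 0/1 samples. The waveforms, the function names and the way the widened signal is obtained (its rising edge is simply placed no earlier than that of ht) are illustrative assumptions; the gate-level circuits of Figures 16 and 17 are not modelled.

```python
from typing import List


def hit_out(st: List[int], hti: List[int], ht: List[int]) -> List[int]:
    """hto as the OR of the inverted self-timed signal, hti and ht, per sample."""
    return [(1 - s) | i | h for s, i, h in zip(st, hti, ht)]


def has_downward_glitch(wave: List[int]) -> bool:
    """True if the waveform falls from 1 to 0 and later rises back to 1."""
    fell = False
    for previous, current in zip(wave, wave[1:]):
        if previous == 1 and current == 0:
            fell = True
        elif previous == 0 and current == 1 and fell:
            return True
    return False


# Hypothetical waveforms over eight time steps, with hti = 0 throughout.
hti = [0, 0, 0, 0, 0, 0, 0, 0]
ht = [0, 0, 0, 0, 0, 1, 1, 1]       # combined hit result arrives at step 5
st_fast = [0, 0, 0, 1, 1, 1, 1, 1]  # array-level timing: rises at step 3
st_wide = [0, 0, 0, 0, 0, 1, 1, 1]  # widened pulse: rises at step 5

hto_fast = hit_out(st_fast, hti, ht)
hto_wide = hit_out(st_wide, hti, ht)
```

With the narrow pulse, hto drops at step 3 and recovers at step 5 when ht arrives; with the widened pulse, hto stays high throughout.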

[0080] The CAM array is not limited to the ones shown in Figures 9, 10, 11, 13 and 15. There are many variations. For
example, the data comparison function of a CAM array may be performed not by the core cells but by separate
comparators placed adjacent to the core cells. Such a CAM array is described in United States Patent Application No.
08/748,928 entitled "Large-Capacity Content Addressable Memory", filed on November 14, 1996 by K.J. Schultz et al,
which is incorporated herein by reference.

[0081] In the system in which a plurality of CAM chips are cascaded according to the embodiment of the present
invention, status registers, address result propagation, last-device identification, or storage of common entries are not
employed.

[0082] It is understood that there are many possible variations in embodiment detail that are logically subsumed by
this invention disclosure, including different signal polarities, equivalent Boolean gate-level implementations, small
timing variations, and so on.

Claims 

1. A system comprising:
a common bus; and

a plurality of content addressable memory (CAM) chips which are cascaded and connected to the bus, each
of the CAM chips comprising an array of core cells of w words x b bits and encoding means, the core cell
comprising data storage means, the CAM chip being able to provide through the encoding means hit and match
address signals resulting from a search operation,
each of the CAM chips comprising:
means for propagating the hit signal from chip to chip; and
means for providing the address signal to the bus.

2. The system of claim 1, wherein the CAM chip further comprises driving means for deciding which CAM chip is 
allowed to provide the address signal to the bus. 

3. The system of claim 2, wherein the driving means comprises means for preventing more than one CAM chip from
providing the address signal to the bus simultaneously.

4. The system of claim 2, wherein the chip comprises means for causing the driving means to be disabled based on
signals generated on-chip and enabled by a signal propagating from upstream in the cascade.

5. The system of claim 1, further comprising means for observing the search results for the cascaded CAM chips from
the hit signal of the furthest-downstream chip and the match address signal on the bus.

6. The system of claim 1, further comprising means for determining the priority of chips in the cascaded CAM chips
by position, such that the further upstream a chip is, the higher the priority.

7. The system of claim 1, wherein each of the CAM chips further provides multiple match indication on search
operation through the encoding means and further comprises means for propagating the multiple match indication from
chip to chip.

8. The system of claim 1, wherein the CAM chip further comprises means for providing the match address signal to
the bus in a sequence of subsequent cycles in a case of more than one match address.

9. The system of claim 1, further comprising means for encoding the ordinal location of the highest-priority chip with
a match from among the cascaded multiple CAM chips.

10. The system of claim 1, wherein the CAM chip further comprises self-timing means for generating a self-timed signal
for the chip's operation, in response to a clock signal.

11. The system of claim 10, wherein the CAM chip further comprises means for determining the hit output of each CAM
chip being logically determined as the logic OR of three signals:

the inverted self-timed signal; 
the hit input signal; and 
the on-chip hit result. 

12. The system of claim 11, further comprising a four-input OR gate, wherein the self-timed signal being optionally
decomposed into two inputs of the four-input OR gate, with one input being a delayed self-timed signal so as to
widen the self-timed signal pulse to allow for the delay due to the combination of search results from among
multiple on-chip CAM arrays.

13. The system of claim 11, wherein the CAM chip comprises a plurality of CAM arrays and the self-timed signal is the
logic OR of:

a fast self-timed signal generated directly by one of the CAM arrays; and 

a slow self-timed signal generated by logic circuitry which models the delay experienced by a hit signal logic 
circuit. 

14. The system of claim 10, wherein the CAM chip further comprises self-timing means for generating a self-timed
signal for the chip's operation, in response to a clock signal.